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Abstract 

We  formulate  the  problem  of  estimating  the  motion  of  a  rigid  object  viewed  under  perspective 
projection  as  the  identification  of  a  dynamic  model  in  Exterior  Differential  form  with  parameters 
on  a  topological  manifold. 

We  first  describe  a  general  method  for  recursive  identification  of  nonlinear  implicit  systems 
using  prediction  error  criteria.  The  parameters  are  allowed  to  move  slowly  on  some  topological 
(not  necessarily  smooth)  manifold.  The  basic  recursion  is  solved  in  two  different  ways:  one  is 
based  on  a  simple  extension  of  the  traditional  Kalman  Filter  to  nonlinear  and  implicit  measure¬ 
ment  constraints,  the  other  may  be  regarded  as  a  generalized  “Gauss-Newton”  iteration,  akin  to 
traditional  Recursive  Prediction  Error  Method  techniques  in  linear  identification.  A  derivation 
of  the  “Implicit  Extended  Kalman  Filter”  (IEKF)  is  reported  in  the  appendix. 

The  ID  framework  is  then  applied  to  solving  the  visual  motion  problem:  it  indeed  is  possible 
to  characterize  it  in  terms  of  identification  of  an  Exterior  Differential  System  with  parameters 
living  on  a  Co  topological  manifold,  called  the  “essential  manifold” .  We  consider  two  alterna¬ 
tive  estimation  paradigms.  The  first  is  in  the  local  coordinates  of  the  essential  manifold:  we 
estimate  the  state  of  a  nonlinear  implicit  model  on  a  linear  space.  The  second  is  obtained  by  a 
linear  update  on  the  (linear)  embedding  space  followed  by  a  projection  onto  the  essential  mani¬ 
fold.  These  schemes  proved  successful  in  performing  the  motion  estimation  task,  as  we  show  in 
experiments  on  real  and  noisy  synthetic  image  sequences. 


1  Introduction 

The  “visual  motion  estimation”  is  concerned  with  reconstructing  the  motion  of  the  viewer  relative 
to  the  environment  from  its  projections  onto  the  retina  (or  CCD  surface).  The  task  may  be 

‘Research  funded  by  the  California  Institute  of  Technology,  an  AT&T  Foundation  Special  Purpose  grant,  ONR 
grant  N0014-93-1-0990  and  grant  ASI-RS-103  from  the  Italian  Space  Agency.  This  work  is  registered  as  Technical 
Report  CIT-CDS  94-004,  California  Institute  of  Technology,  1994.  Submitted  to  the  invited  session  on  “Dynamic 
Vision,  System  Theoretical  Methods  and  Control  Applications”  at  the  33rd  IEEE  conf.  on  Decision  and  Control, 
Florida,  1994. 
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separated  into  two  steps:  first  establish  which  point  on  the  retina  corresponds  to  which  across 
time  (correspondence  problem),  and  then  estimate  the  motion  of  the  viewer  and  the  structure  of 
the  environment  from  the  correspondence.  This  classification  is  rather  arbitrary;  it  is  convenient, 
however,  to  assume  that  the  correspondence  has  been  solved  in  order  to  concentrate  on  the  geometric 
structure  of  the  visual  motion  problem.  For  a  review  of  the  existing  methods  for  addressing  the 
correspondence  problem,  see  for  example  [2]. 

Visual  motion  estimation  is  a  key  task  in  many  control  applications  involving  the  interac¬ 
tion  with  the  environment,  such  as  autonomous  robot  navigation  [14,  16,  15],  visual-based  track¬ 
ing/servo  [24,  25,  37],  visual-based  manipulation  [4,  25],  docking  [28,  15],  visual-based  planning  [11], 
active  sensing  [59].  In  recent  years  the  problem  has  been  addressed  using  nonlinear  estima¬ 
tion/identification  techniques  [40,  27,  45,  3,  57,  55]. 

In  order  to  formalize  the  problem  and  cast  it  into  a  system-theoretic  framework,  we  need  to 
specify  a  “description”  for  the  scene  and  for  the  motion  of  the  viewer.  Based  on  which  scene 
descriptors  are  used,  the  existing  methods  for  motion  estimation  may  be  classified  as  point-based, 
line-based,  curve-based  or  model-based.  We  are  interested  in  the  simplest  case  when  the  scene 
is  described  by  a  number  of  feature  points  in  the  euclidean  3D  space.  For  line-based  schemes 
see  [68,  62]  and  references  therein.  The  curve-based  approach  has  been  addressed  in  [1,  61,  9]. 

The  point-based  methods  may  be  further  classified  based  on  the  camera  model  employed.  The 
simplest  cases  assume  either  parallel  projection  [65,  64,  63,  51]  or  ideal  perspective  projection 
(pinhole  model,  see  [18]  for  a  review).  More  articulated  camera  models  as  general  homographies 
allow  parallel  and  perspective  projection  as  a  subcase  [3,  60,  20,  54].  We  will  be  mostly  concerned 
with  the  classical  pinhole  model,  although  it  is  possible  to  generalize  our  schemes  to  more  general 
camera  representations  and  estimate  the  camera  model  along  with  visual  motion  (self- calibration, 
see  [54,  20]).  Recent  schemes  recover  projective,  non-metric  structure  and  motion  independent  on 
the  camera  parameters  [17,  51,  42]. 

Motion  reconstruction  schemes  may  be  further  classified  based  on  the  data  processing  technique 
as  either  2-frames  schemes  (see  for  example  [39,  29,  66]),  multiframe-batch  methods  [65,  60]  and 
recursive  algorithms. 

In  the  last  decade  a  variety  of  schemes  has  been  proposed  for  reconstructing  recursively  structure 
for  known  motion  [40],  motion  for  known  structure  [21,  22,  6]  or  both  structure  and  motion  [27,  3, 
45,  57,  55].  In  this  paper  we  unify  them  into  a  common  framework  and  highlight  the  limitations 
of  the  model  employed,  which  motivate  the  new  formalization  in  terms  of  identification  (ID)  of 
Exterior  Differential  Systems  (EDS)  [7]  which  we  introduce. 

We  will  then  address  a  general  technique  for  performing  the  ID  of  EDS  using  nonlinear  prediction 
error  criteria,  and  we  will  apply  it  to  the  visual  motion  problem.  We  will  see  how  other  problems 
in  computational  vision  may  be  formulated  in  the  framework  of  ID  of  EDS  and  solved  with  the 
technique  presented  in  this  paper. 

Organization  of  the  paper 

In  section  2  we  will  cast  the  visual  motion  problem  into  a  system-theoretic  framework  in  terms  of 
state  estimation  of  a  nonlinear  dynamical  systems  with  a  differentiable  state-manifold.  Motivated 
by  the  structural  limitation  of  the  model  which  defines  the  visual  motion  problem  in  the  case  of 
feature  points  in  the  euclidean  3D  space,  we  will  develop  a  new  formalization  of  the  problem  in 
terms  of  identification  of  an  EDS  with  the  parameters  on  a  topological  manifold,  called  the  “essential 
manifold” .  We  will  also  present  two  examples  of  other  problems  in  computer  vision  which  may  be 
cast  as  the  identification  of  an  EDS. 
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In  section  4  we  will  analyze  the  estimation  problem  in  general  form  and  develop  a  suboptimal 
technique  for  recursive  identification  of  nonlinear  implicit  systems  nonlinear  in  the  parameters  using 
prediction  error  methods  (PEM)  [58].  The  framework  is  that  of  approximate  maximum  likelyhood 
or  least  squares  identification  using  observers  [33,  8,  31,  10,  48,  47,  32],  extended  to  Differential 
Algebraic  Equations  with  parameters  on  topological  manifolds.  We  use  a  variation  of  the  Extended 
Kalman  Filter  for  implicit  measurement  constraints,  called  IEKF,  which  is  derived  in  the  appendix. 

In  section  5  we  apply  the  method  to  the  visual  motion  problem.  We  propose  three  schemes  for 
performing  the  estimation  task:  the  first  consists  of  writing  the  estimator  in  the  local  coordinates 
of  the  parameter  manifold,  and  then  applying  the  IEKF.  Alternatively  we  write  the  update  in 
the  embedding  space  and  project  it  at  each  step  onto  the  parameter  manifold.  A  third  scheme  is 
based  on  a  double  iteration  and  corresponds  to  the  extension  of  the  usual  least-squares  PEM  via 
Gauss-Newton  iteration.  The  theoretical  observability/identifiability  of  such  schemes  is  addressed 
in  [52]. 

In  section  6  we  compare  the  performance  of  the  three  schemes  on  real  and  noisy  synthetic  image 
sequences. 

2  Visual  motion  estimation 

In  this  section  we  formalize  the  visual  motion  problem  when  the  structure  of  the  scene  is  repre¬ 
sented  as  a  set  of  feature  points  in  the  euclidean  3D  space.  We  restrict  our  attention  to  “static” 
environments,  or  equivalently  to  portions  of  the  scene  which  are  moving  rigidly.  In  such  a  case  the 
problem  is  “defined”  by  the  rigid  motion  constraint  and  the  perspective  map. 

2.1  Formulation  in  terms  of  state  estimation 

Consider  a  rigid  set  of  feature  points  in  3D  space  with  respect  to  some  cartesian  frame,  for  example 
the  one  moving  with  the  observer.  We  call  X'  =  ^  X  Y  Z  J  .  €  IR3  the  coordinates  of  the  iTH 
point,  and  we  let  i  =  1  :  N.  As  the  camera  moves  between  two  discrete  time  instants,  with  rotation 
R  and  translation  T,  the  coordinates  change  according  to  the  rigid  motion  constraint: 

X.\t  +  1)  =  R(t)X\t)  +  T(t)  V  i  =  1  :  N  (1) 

where  motion  is  represented  by  ( R,T )  £  SE( 3)  [43]. 

The  camera  (or  eye)  is  represented  by  a  map  from  the  3D  space  onto  some  2D  surface.  We 
adopt  for  simplicity  the  ideal  perspective  projection  model  [5]: 

7T  :  IR3  UP2 

X  i-»  7r(X)  =  x  y  1  j  =  x  =  y  1  I  (2) 

we  measure  x  up  to  some  error: 

y  =  x  -( -n  n  €  A^O,  Rn ). 

The  representation  thus  proposed  is  the  very  simplest  one  can  immagine;  however,  we  will  show 
that  it  is  not  the  most  appropriate  for  motion  estimation. 

The  equations  (1,2)  may  be  regarded  as  a  dynamical  model  describing  the  motion  of  points  in 
3D  space,  having  a  projection  as  measurement  equation.  Motion  is  the  input  to  the  system,  and 
hence  the  estimator  should  “invert”  the  model  in  order  to  reconstruct  motion  from  time  varying 
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projection  of  feature  points.  Since  the  initial  condition  (structure  at  time  zero)  is  not  known,  we 
have  a  combined  “inversion/estimation”  problem.  It  can  be  shown  [52]  that  any  inverse  system 
for  (1,2)  is  essentially  instantaneous,  hence  it  does  not  exploit  recursiveness  and  its  benefits.  This 
is  due  to  the  fact  that  the  model  above  is  driftless  [30,  44];  a  common  trick  is  then  to  use  dynamic 
extension.  We  augment  (1)  with 

/  R(t  +  1)  =  R{t )  +  nR(t)  /o\ 

\  r(*  +  l)  =  T(f)  +  nr(0  { 

G  SO( 3)  ,  nx  G  1R3.  Once  inserted  motion  into  the  state  dynamics  we  have  transformed  the 
motion  problem  into  a  state  estimation  task  for  a  dynamical  system  with  unknown  inputs,  since 
we  do  not  know  hr  and  n x- 

If  we  have  a  dynamical  model  available,  as  for  example  when  the  camera  is  mounted  on  a 
moving  vehicle,  we  may  exploit  it  in  place  of  (3).  In  lack  of  a  mechanical  model  we  may  imply  a 
statistical  model,  for  example  a  fixed  order  random  walk.  The  extreme  case  is  nx  =  0,  ur  =  0, 
which  corresponds  to  constant  velocity  motion. 

A  common  model  is  a  first  order  random  walk,  which  describes  a  brownian  motion.  For  instance 
we  may  assume  nj  G  M(0,Rt)  and  ur  =  enRA  G  50(3)  with  Hr  G  Af(0,Rn).  All  of  these  are 
modeling  assumptions,  and  they  must  be  verified  a  posteriori. 

A  fundamental  issue  in  deriving  a  state  estimator  (observer)  is  of  course  observability  [30,  44, 
34,  35,  36,  50].  The  observability  of  the  motion  problem  is  addressed  in  [52].  The  system  under 
investigation  (1,2,3)  has  the  peculiarity  of  not  only  having  a  linearization  which  is  not  observable, 
but  of  also  being  non  locally  weakly  observable.  We  need  to  impose  metric  constraints  on  the  state 
manifold  in  order  to  achieve  local  observability;  furthermore  the  observable  manifold  is  covered 
with  a  high  level  of  lie- differentiations,  which  makes  the  observer  porely  conditioned. 

Note  that  the  model  described  above  is  “block-diagonal”  with  respect  to  the  structure  parame¬ 
ters  X»,  and  any  observable  motion  combination  can  be  observed  regardless  the  number  of  visible 
points.  Indeed  it  is  strongly  intuitive  that  the  more  points  are  available,  the  more  robust  the  per¬ 
ception  (estimate)  of  motion  should  be.  Also  note  that,  once  motion  has  been  estimated,  structure 
is  linearly  observable  [52]  from  the  model  (1,2),  and  hence  a  simple  EKF  will  suffice  to  estimate 
it,  provided  that  we  keep  an  explicit  representation  of  the  second  order  statistics  of  the  motion 
estimation  errors  [45,  57]. 

These  considerations  motivate  the  introduction  of  a  new  model,  based  upon  a  motion  represen¬ 
tation  which  dates  back  to  Longuet- Higgins  [39]. 

2.2  Formulation  as  identification  of  an  exterior  differential  system 

A  rigid  scene  is  moving  with  T(t),R(t )  between  two  time  instants.  Then  it  is  immediate  to  see 
(fig.  1)  that  the  vector  X,  describing  the  coordinates  of  the  generic  point  at  time  t,  the  vector  X' 
of  coordinates  at  time  t  +  1  and  T,  are  coplanar,  and  therefore  their  triple  product  is  zero.  This  is 
true  of  course  also  for  x,  x'  and  T,  since  x  is  the  projective  coordinate  of  X  and  therefore  the  two 
are  coincident  in  Dl3,  interpreted  as  the  “ray-space”  model  [49].  When  expressed  with  respect  to  a 
common  reference,  for  example  that  at  time  t,  we  may  write  the  triple  product  as1 

x'f  /?(T  A  x,-)  =  0  V*  =  1:N.  (4) 

1Note  that  we  model  rigid  motion  with  T,  R  s.t.  X'  =  i?(X  —  T),  for  consistency  with  the  notation  of  [39]. 
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The  Essential  Constraint  for  rigid  Motion 


Figure  1:  The  essential  constraint 

It  turns  out  that  the  above  constraint  is  not  only  a  consequence  of  rigid  motion,  but  it  also  suffices 
to  characterize  it,  once  eight  or  more  constraints  are  given  [41,  39].  The  operator 

0  -t3  t2 

T  A  =  T3  0  —T\  =  S. 

-T2  1\  0 

belongs  to  so(3)  [43].  Following  Longuet-Higgins  [39]  we  call 

Q  =  R(TA) 

so  that  the  above  constraint,  which  we  now  call  the  “essential  constraint”,  becomes 

xfQx2  =  0  V*=1...JV.  (5) 

Estimating  motion  corresponds  to  identifying  the  model 

(Qxi)Tx/i  =  0  Q  eE 
Yi  =  Xf  +  rii  m  £  JV(0,  Rni), 

which  is  in  the  form  of  an  Exterior  Differential  System  [7];  the  parameters  T,R  are  encoded  in  Q. 
Since  the  constraint  (5)  is  linear  in  Q,  we  use  the  (improper)  notation 

Xx'(t),x(t)  Q(0 _  o 

where  x  is  an  iV  x  9  matrix  combining  Xi ,  x'  and  Q  is  interpreted  as  a  nine  dimensional  vector. 
The  generic  row  of  x  has  the  form  [aqaa  x2xi  xi  x\x'i  x<ix'i  x'2  X\  x2  1  ]. 

2.2.1  The  Essential  Space 

A  rigid  motion  may  be  represented  as  an  element  of  the  Lie  group  SE( 3),  which  is  naturally 
embedded  in  ]R4x4  via  homogeneous  coordinates: 

6  SE( 3)  C  IR16. 
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We  have  indeed  seen  that  rigid  motion  may  be  encoded  using  the  essential  constraint  (5)  based 
on  the  3x3  matrix  Q  =  R(T A)  C  1R9.  Since  we  can  reconstruct  translation  only  up  to  a  scale 
factor,  we  may  restrict  Q  to  belong  to  IRP8  instead  than  IR9.  It  is  customary  to  set  the  norm  of 
translation  to  be  unitary;  this  can  be  done  without  loss  of  generality,  as  long  as  translation  is  not 
zero.  The  zero-norm  translation  case  can  be  dealt  with  separately,  and  we  will  discuss  it  later.  Now 
for  simplicity  we  assume  ||Q||2  =  ||T||  =  1.  The  matrix  Q  belongs  to  the  space 

E  =  {RS\R  e  50(3),  5  =  TA  e  so(  3),  ||T||  =  1}CE9 


which  is  called  the  essential  space.  The  essential  space  encodes  rigid  motion  in  a  more  compact 
way  than  SE( 3),  the  price  being  that  we  loose  the  smooth  group  structure.  Indeed,  as  shown 
in  [55,  52],  a  slight  modification  of  E  proves  to  have  the  structure  of  a  topological  manifold.  For  let 
dx,x'(Q)  be  the  triangulation  function  which  gives  the  depth  of  a  point  from  its  motion  Q  and  its 
projective  coordinates  x,  x'.  Then  E  =  E  fi  cC*  dlRi)  is  a  topological  manifold  called  the  “essential 
manifold”  [55],  Call 


$  :  E  R5 

Q  ~  W  ,  0]T 


a  chart  of  the  local  coordinates  atlas  of  the  essential  manifold  (  see  [55]  for  an  explicit  character¬ 
ization  of  <&);  [V  ,  0,]T  represent  the  canonical  (exponential)  local  coordinates  of  (T,  R)  €  SE( 3) 
via 


R 

T 


e(fiA) 

1 

m 


[(/  —  e<nA) )  n  A  +MI2 


V. 


(6) 

(7) 


E  also  has  the  structure  of  an  algebraic  variety  [41],  which  we  will  not  discuss  in  this  paper. 


2.2.2  Motion  representation  on  the  essential  space 

A  rigid  motion  with  unit  norm  translation  may  be  represented  as  an  element  of  the  essential 
manifold  E.  For  non-unit  translations  (but  still  positive  norm),  it  is  sufficient  to  scale  Q  to  Q/||Tj|, 
since  the  singular  values  of  Q  are  {||T||,  ||T||,0}  (see  appendix  B).  At  each  time  instant  we  have  a 
set  of  N  constraints  in  the  form 

Xx'(t),x(t)  Q(^)  b, 

therefore  Q  lies  at  the  intersection  between  the  essential  manifold  and  the  linear  variety  x 
(see  fig.  2). 

Note  that  even  after  imposing  unit  norm  there  is  still  a  sign  indeterminacy  in  Q,  which  accounts 
for  the  two  solutions  Qi  and  Q2.  These  solutions  become  four  when  transformed  to  local  coordi¬ 
nates.  These  ambiguities  can  be  overcome  by  imposing  the  positive  depth  constraint  as  it  is  done 
in  the  definition  of  the  essential  manifold  [52,  55]:  in  fact,  out  of  the  four  different  combinations  of 
R  and  T,  only  one  corresponds  to  points  which  are  in  front  of  the  observer  [67,  23,  46,  19]. 

As  time  goes  by,  the  point  Q (t),  corresponding  to  the  actual  motion,  describes  a  trajectory  on 
E  (and  a  corresponding  one  in  local  coordinates): 

Q (<) »-  Q(f  + 1)  =  Q(t)  +  nQ(t). 

The  last  equation  is  in  fact  just  a  definition  of  the  right-hand  side,  since  we  do  not  know  «Q(t). 
For  now  we  will  consider  the  previous  equation  as  a  discrete  time  dynamical  model  for  Q  on  the 
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essential  manifold,  having  tiq  as  unknown  input.  If  we  accompany  it  with  the  essential  constraint, 
we  get 

f  Q(*  +  1)  =  Q(0  +  nQ(t)  Q  G  E 

S  0  =  Xy'(t),y(t) Q(0  (8) 

l  Yi  =  Xf  +  Hi  V  i  =  1 . . .  N 

where  n(t)  6  JV(0,  Rn).  Note  that  now  the  visual  motion  problem  is  defined  as  the  estimation  of 
the  state  of  the  above  model,  which  is  defined  on  the  essential  manifold.  As  it  can  be  seen  the 
system  is  “linear”  (both  the  state  equation  and  the  essential  constraint  are  linear  in  Q);  however,  E 
is  not  a  linear  space.  In  the  section  4  we  will  develop  a  general  tool  for  addressing  the  identification 
problem. 

3  Other  problems  which  may  be  formulated  as  ID  of  EDS 

In  the  present  section  we  consider,  as  an  example,  two  additional  problems  which  may  be  cast  in 
the  framework  of  identification  of  exterior  differential  systems  with  parameters  on  a  manifold. 

The  first  problem  is  “camera  self  calibration”,  which  consists  of  the  dynamic  estimation  of  the 
camera  model  along  with  the  motion  parameters  .  It  has  been  shown  [54]  that  the  problem  may 
be  formulated  as  an  extension  of  the  scheme  derived  in  the  previous  section  when  the  essential 
manifold  is  substituted  by  the  space  of  the  “fundamental  matrices”  [20,  54]. 

The  second  problem  is  the  recovery  of  the  direction  of  translation  using  subspace  methods:  [53] 
provides  a  way  of  formalizing  the  problem  as  identification  of  an  exterior  differential  system  with 
parameters  on  a  sphere  [26,  53]. 

3.1  Dynamic  self-calibration 

In  so  far  we  have  represented  the  camera  as  an  ideal  central  projection  of  unit  focal  length.  When 
the  camera  model  is  a  more  general  projective  transformation  in  HP2,  eq.  (5)  does  not  hold. 
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Figure  2:  Structure  of  the  motion  problem  on  the  Essential  Space 


However,  a  similar  constraint  may  be  derived  based  on  the  epipolar  geometry  as 


x'f  Fxi  =  0 


Vi  =  1...N. 


The  matrix  F  is  called  “fundamental  matrix”;  it  defines  the  relation  between  each  point  i  and  its 
corresponding  epipolar  line  [20].  If  the  camera  is  represented  (in  homogeneous  coordinates)  as  a 
3x4  matrix  A  |  0  where 

"  fsx  0  — *o 

A=  0  fsy  -jo 
0  0  1 

is  the  internal  parameter  matrix2,  then  it  can  be  shown  that 


Q  =  ArFA  6  E 


is  an  essential  matrix. 

Faugeras  et  al.  [20]  propose  to  estimate  the  matrix  F  from  the  constraint  (9),  and  then  impose 
the  structure  of  the  fundamental  matrix  (10)  a  posteriori  by  solving  a  set  of  polynomial  equations 
known  as  Kruppa  equations.  Such  equations  are  indeed  ill  conditioned,  and  the  scheme  is  very 
sensitive  to  noise.  Furthermore,  temporal  coherence  of  the  camera  model  is  not  exploited. 

If  we  substitute  (10)  into  (9)  we  have  a  model 

f  (A~r<QA-1Xj-)Tx/;  =  0  Q  e  E 
l  Yi  =  x;  +  rii  V  i. 

2f  is  the  focal  length,  (to,  jo)  the  coordinates  of  the  optical  center  and  (sj,  sy)  the  pixel  sizes  along  the  image 
plane  coordinates.  The  deviation  from  90°  of  the  angle  between  the  optical  axis  and  the  CCD  surface  is  usually  on 
the  order  of  1°,  and  we  may  therefore  neglect  it. 
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Estimating  the  camera  parameters,  along  with  rigid  motion,  may  then  be  formulated  as  identifi¬ 
cation  of  the  above  exterior  differential  model,  where  the  parameters  are  on  the  manifold  E  x  A  F, 
and  AF  is  the  set  of  affine  transformations  of  1R2  represented  in  homogeneous  coordinates.  This 
formulation  has  been  derived  in  [54]. 


3.2  Recovering  rigid  motion  using  subspace  methods 


1  2?«(x)  ] 

- 1 

t-a 

i  j 

- 1 

’-5"3T' 

G 

_ i 

where  .4(x,  V)  = 


Vi  -  xV3 
V2  -  yV 3 


and  V  €  S2  is  represented  in  local  coordinates  as  V(6,(f>).  B  = 


-xy  l  +  x2  -y 
—  1  —  y2  xy  x 

CdT,  where 


.  If  we  observe  N  points  we  may  write  x  =  C{V,  x)[^, . . .,  gU,0]r  = 


C(V,x)  = 


Ai 


Bi 


An  Bn 

Under  the  usual  rank  conditions,  we  may  compute  the  least  squares  approximation  of  d  as 


d  =  & 


Xi 


L  xjv 


=  U*x 


and  therefore  the  motion  field  specifies  the  constraint  [26] 

Cx(U,x)x  =  0 

where  X  indicates  the  orthogonal  complement.  Heeger  and  Jepson  [26]  propose  to  estimate  the 
direction  of  translation  by  minimizing  the  two  norm  of  the  above  constraint  over  V  €  S2.  They 
perform  such  a  minimization  by  extensive  search  over  all  possible  directions  6,  <f>. 

Indeed  it  is  immediate  to  see  [53]  that  the  problem  of  estimating  the  direction  of  translation 
may  be  rephrased  as  the  problem  of  identifying  the  following  exterior  differential  system,  with 
parameters  V  on  a  sphere : 

/  (^(U,  x)x  =  0  V  £  S2 
l  y f  =  Xi  +  n;  Vi. 

This  problem  can  be  solved  in  a  principled  manner  using  the  results  of  the  next  section,  without 
requiring  any  extensive  search  or  sampling  of  the  sphere.  See  [53]  for  more  details. 


4  Identification  of  nonlinear  implicit  systems  with  prediction  er¬ 
ror  criteria 

Suppose  {*(/)}  €  is  a  trajectory  on  a  linear  state-space,  which  is  subject  to  an  implicit  dynamic 
constraint  of  the  form 

h[x(t),dx(t),a]  -  0  *(0)  =  a;0  (11) 
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where  a  £  M  are  some  unknown  parameters  which  can  move  (slowly)  on  a  topological  (not  neces¬ 
sarily  smooth)  manifold.  Call  a  =  4>{a)  £  IRm  the  local  coordinates  correspondent  of  a.  Suppose 
we  are  able  to  measure  x  up  to  some  white,  zero-mean  gaussian  noise: 

y{t)  =  x(t)  +  n(t)  n  £  Af(0,  Rn)- 


We  are  interested  in  identifying  the  parameters  a  recursively  from  the  measurements  {y(t)}  based 
on  the  minimization  of  some  cost  function  of  the  prediction  error  (for  a  classical  treatment  of  PEM 
for  linear  explicit  models  see  for  example  [58]). 

A  common  paradigm  for  PEM  identification  consists  in  forcing  a  Kalman  Filter  to  work  as  a 
parameter  estimator.  The  state  of  the  filter  is  augmented  with  the  unknown  parameters,  which  are 
described  using  a  low  order  random  walk  model.  The  sequel  of  modeling  operations  is  described  as 


from  {*(<)}  and  {x  —  ax}  identify  a  via  observing 


f  a(t  +  1)  =  a(t )  +  na 
1  yl  =  ay +h 


where  y,  yl  are  noisy  measurements  of  a:,  a;  and  n  is  a  residual  which  can  be  characterized  in  terms  of 
the  noise  n.  Our  aim  is  to  generalize  this  paradigm  to  nonlinear  implicit  dynamics  and  parameters 
living  in  topological  manifolds.  In  the  following  we  will  consider  discrete  dime  dynamics,  which  fall 
in  the  same  scheme  described  above,  once  we  substitute  x,yl  with  x(t  -f-  l),y(t  -f  1). 

First  we  proceed  in  analogy  with  the  linear-explicit  case:  we  describe  the  local  coordinates  of  the 
parameters  as  first  order  random  walk,  and  use  the  dynamic  constraint  as  an  implicit  measurement 
constraint: 


f  a(t  +  1)  =  a(t)  +  na(t )  o(0)  =  a0 

l  h  [y(t)  -  n(t),  y(t  -  1)  -  n(t  -  1),  ^_1(a(<))]  =  0 


where  we  have  substituted  the  index  t  with  t  —  1  in  the  measurements  {y}  (or  equivalently  the 
estimator  runs  with  one  step  delay).  The  noise  process  {n(t)}  induces  a  residual  in  the  measurement 
equation:  if  we  approximate  x(t)  with  y(t),  in  general  we  will  observe  h  [ y(t ),  y(t  -  1),  a]  —  h,  where 
n  depends  on  n(t),n(t  -  1),  y(t),  y(t  —  1)  and  a.  This  residual  is  exactly  the  prediction  error  (or 
pseudo-innovation)  when  choosing  a  least-squares  criterion  in  the  PEM. 


Let  us  collect  the  measurements  into  a  vector  yT(t)  =  yT(t)  yT(t 
n(t)  =  [nT(t)  nT{t  —  1)]T.  Our  task  is  to  estimate  a  from  the  model 


1) 


T 

,  and  idem  with 


f  a(t  +  1)  =  a(t )  +  na(t )  a(0)  =  «0 

I  h  [y(t)  -  n(t),  V>_1(o:(t))]  =  0. 


In  order  to  follow  the  course  of  the  linear-explicit  case,  we  have  to  solve  a  number  of  problems: 

RnS(t  —  s)  RnS(t  -  s  +  1) 

RnS(t  —  s  -  1)  RnS(t  -  s) 

2.  the  error  n  does  not  appear  additively  in  the  measurement  equation 

3.  the  measurement  equation  is  nonlinear  and  implicit. 


1.  the  noise  n  is  not  white,  since  E[n(t)nT(s )]  = 


The  Extended  Kalman  Filter  (EKF)  [33,  8,  31]  is  a  general-purpose  extension  to  nonlinear  systems 
of  the  traditional  Kalman  Filter.  It  is  based  on  a  variational  model  about  the  best  current  trajectory. 
The  systems  is  linearized  at  each  step  around  the  current  estimate  in  order  to  calculate  a  correcting 
gain;  the  update  of  the  previous  estimate  is  then  performed  on  the  original  (nonlinear)  equations. 
In  order  to  solve  step  3.  we  need  to  further  extend  the  EKF  to  cope  with  the  implicit  measurement 
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constraint.  This  is  done  in  appendix  A.  We  call  the  result  Implicit  Extended  Kalman  Filter 
(IEKF);  some  variations  of  the  scheme  have  been  used  in  different  applications  in  the  last  years, 
see  for  example  [13].  The  derivation  is  based  on  the  simple  fact  that  the  variational  model  about 
the  current  trajectory  is  linear  and  explicit ,  so  that  the  a  pseudo-innovation  process  may  be  defined 
analogously  to  the  explicit  case.  Note  that  the  local  coordinates  chart  of  the  parameter  manifold  ip 
enters  into  the  measurement  equation,  and  therefore  it  is  differentiated  in  order  to  compute  the  gain. 
However,  the  update  equation  is  calculated  on  the  actual  nonlinear  model,  so  that  discontinuities 
of  the  derivative  of  ip -1,  which  may  happen  when  switching  from  one  chart  to  another,  are  well 
tolerated. 

The  derivation  of  the  IEKF  in  appendix  A  also  solves  step  2.  The  residual  of  the  measurement 
equation  n,  which  is  in  fact  the  pseudo-innovation  of  the  filter,  is  characterized  in  terms  of  n, 
provided  that  the  last  is  white,  zero- mean  and  uncorrelated  with  na. 

In  the  following  section  we  will  show  how  to  whiten  n  and  therefore  reduce  the  problem  to  a 
form  suitable  for  using  the  IEKF.  Later  we  will  see  how  the  problem  simplifies  by  assuming  that  n 
is  white. 

4.1  Uncorrelating  the  model  from  the  measurements 


Consider  a  first  order  expansion  of  the  measurement  equation  about  the  point  y(t),a(t ): 


h[y(t),ip  - 

-  D+(t)n(t)  -  D-(t)n(t  -  1)  =  0(||h||2)  2  0 

where  we  have  defined 

»+(<)  = 

/ dh[x(t),  x(t  -  1),  a]\ 

V  dx(t)  ) 

(14) 

DM)  = 

f  dh[x(t),x(t  -  1),  a]\ 

\  dx(t  —  1)  /  1  (or(t)) 

(15) 

Here  the  residual  n{t )  =  -D+(t)n(t)  -  D_(t)n(f  —  1)  is  clearly  correlated.  In  order  to  estimate  the 
dynamics  of  n(f),  we  may  insert  it  into  the  state  dynamics:  call  z{t )  =  n(t  -  1). 

(  a(t  +  1)  =  a(t)  +  na(t)  a(0)  =  a0 

<  z(t  +  1)  =  n(i)  z(0)  =  0  (16) 

1  0  =  h  [?/(/), ^_1(a(f))]  -  D-(t)z(t)  +  w(t) 

where  we  have  defined  w(t)  =  -D+(t)n(t).  Now  the  measurement  error  w  is  white;  however, 
it  is  correlated  with  the  model  error  v  =  [n^,nT]T.  We  may  therefore  project  the  model  error 
onto  the  span  of  the  measurement  error  H(w)  in  order  to  make  the  two  orthogonal.  We  define 
v(t)  =  v{t)-E[v{t)\H{w)].  Since  w(t),n(t )  and  na(t)  are  white,  it  is  easily  seen  that  E[v(t)]H(w)]  = 
l?[v(f)|w(t)]  =  S„wE“1tn(t).  Evw  and  En  are  variance/covariance  matrices.  If  we  define 


Q(t)  = 

Ra  0 

0  Rn 

(17) 

R(t)  = 

D+(t)Rn(t)Dl(t) 

(18) 

S(t)  = 

0 

-Rn(t)Dl(t) 

(19) 

it  is  easy  to  see  that  =  S(t)R  J(f);  furthermore  'Ey  =  Q(t )  =  Q(t )  +  S(t)R  1(t)ST(t). 

Now  v  =  v  —  SR~lw  is  by  construction  orthogonal  (uncorrelated)  to  w. 
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4.2  A  model  for  PEM  identification  of  nonlinear  Implicit  models 

In  the  previous  paragraph  we  have  derived  a  substitution  for  the  model  error  which  is  by  construc¬ 
tion  uncorrelated  with  the  measurement  error.  Therefore  we  may  write  a  new  model  which  satisfies 
the  conditions  of  appendix  A: 

{a(t  +  1)  =  a(t )  a(0)  =  op 

z(t  +  1)  =  K(t)  ( h  [</(*),  -  D-(t)z(t))  z( 0)  =  0  (20) 

0  =  h  [y(t),  ^_1(a(t))]  -  D-(i)z{t)  +  w(t) 

where  we  have  defined 

K(t)  =  RnWD+MD+ftRnWDKt))-1  (21) 

w(t)  =  -£>+(t)n(t).  (22) 

By  applying  the  results  of  appendix  A,  we  have  a  pseudo-optimal  PEM  identification  scheme 
described  by  the  following  iteration: 

Prediction  step 

(  d(t  +  1| t)  =  a(t\t )  d(0|0)  =  op 

<  z(t  +  l\t)  =  K(t)(h[y(t),a(t\t)\- D-(t)z(t\t))  £(0|0)  =  0  (23) 

{  P{t  +  l|t)  =  F(t)P(t\t)FT(t\t)  +  Q(t)  P(0|0)  =  Po 


where  F  = 


I  0 

o  D-m 


and  C(t )  = 


/  dhly,^-1  (a)]\ 

\  9a  /|a(i|t),ff(t)' 


Update  step 


dt(t  -j-  1|  t  -f-  1) 

<  z(t  +  l|t  +  1) 

.  P(t  +  l|t  +  1)  = 


where 


a(«  +  l|0 

z(t  +  l\t+  1) 


+  L(t  +  1)  (h  [y(t),  a(t  +  1|/)]  -  D-(t)z(t  +  l|t)) 


l)P(t  +  l|0rT(<  +  1)  +  L(t  +  l)D+(t)Rn(t  +  1  )Dl{t)LT(t  +  1) 

(24) 


L  =  PCT  A"1  (25) 

A  =  CPCT  +  D+(t)Rn(t  +  (26) 

T  =  I-LC  (27) 


Note  that  we  are  trying  to  estimate  a  process  {z(t)}  which  is  nearly  white  noise  (n(t)  is  correlated 
only  within  one  step).  Furthermore  if  we  expect  a  large  number  of  measurement  components  n, 
the  cost  in  updating  a  large  state  and  tuning  a  large  number  of  model- variance  parameters  may  be 
relevant.  In  practical  applications  the  approximation  n  as  white  noise  are  often  best  conditioned. 
In  the  following  section  we  show  how  the  structure  of  the  filter  simplifies  in  such  a  case. 

4.3  A  simplified  version:  approximate  Least  Squares  PEM  identification 

In  this  section  we  report  the  equations  of  the  parameter  estimator  which  are  obtained  supposing 
that  the  residual  h  is  white.  This  correspond  to  applying  the  results  of  appendix  A  directly  to  the 
model  of  eq.  (13): 
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Prediction  step 


f  a(t  +  l\t)  =  a(t\t) 

{  P(t  +  l\t)  =  P(t\t)  +  Ra(t) 


(28) 


d(0|Q)  =  «o 

P(  0|0)  =  P0 


Update  step 

(  a(t  +  1| t  +  1)  =  a(t  +  l|i)  +  L(t  +  1  )h  [y(t),i>-l(a{t  +  l|t)] 

1  P(t  +  l|t  +  1)  =  T(i  +  1  )P(t  +  l|*)rr(i  +  1)  +  L(t  +  1  )D+(t)Rn(t  +  1  )Dl(t)LT{t  +  1) 

(29) 


where  now  we  the  quantities  L,  A  and  I'  are  defined  according  to  appendix  A.  Note  that  we  have 
reduced  the  size  of  the  state  from  n  +  m  down  to  m. 


Detecting  outliers 

Note  that  each  component  of  the  pseudo-innovation  is  a  measure  of  the  consistency  of  each  datum 
with  the  current  parameter  estimates.  This  proves  useful  when  applied  to  the  motion  problem 
because  it  allows  us  to  easily  segment  the  scene  into  a  number  of  independently  moving  objects  [56]. 

4.4  An  iterative  scheme  for  computing  the  update 

The  IEKF  update  seen  in  the  previous  section  may  be  substituted  with  a  Gauss-Newton  iteration, 
as  it  is  customary  in  recursive  ID  of  linear  models: 

a(k  +  1)  =  a(k )  -  LNR(k)h(a(k )) 

where  Lnr  =  J'h(et(k ))  and  Jh  is  the  jacobian  of  h. 

Note  that  at  each  fixed  time  we  could  perform  a  Newton-Raphson  iteration  on  the  function 
h(y,a),  for  which  local  convergence  results  are  known  as  well  as  bounds  on  the  convergence  rate. 
This  suggests,  as  an  alternative  to  the  IEKF,  to  fix  t  and  perform  a  Newton-Raphson  iteration 
along  the  k  coordinate.  Once  this  is  done  we  propagate  the  estimate  across  time  with  an  iteration 
which  now  is  linear ,  and  has  all  the  desirable  asymptotic  properties. 

4.4.1  Iteration  at  each  fixed  time 

At  each  time  instant  a  new  set  of  measurements  y  becomes  available.  The  dynamic  constraint 
imposes 

h[y(t),a]  =  0  Vt 

Define  Tah  :  ]Rm  — >  lRAr  to  be  the  derivative  of  the  map  h  and  Jh(o)  the  Jacobian  matrix  calculated 
at  the  point  a.  Suppose  that  there  exists  some  a*  such  that  h(y(t ),  a*)  =  0  for  our  particular  (fixed) 
t.  Then  we  may  write  a  first  order  expansion  around  the  point  a*,  starting  from  some  point  op  (we 
neglect  time  indices  for  the  remainder  of  this  section);  the  resulting  iteration,  which  is  obtained  by 
neglecting  the  second  order  term  of  the  expansion,  is  defined  by 

h\ptk\  =  Jh{,®k)\pk- 1-1  — 

At  each  iteration  we  solve  for  Y  the  linear  problem 

Jh(ak)Y  =  h[ak] 
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and  then  define  =  a*  +  Y.  In  general,  also  due  to  noise,  we  can  expect  $  Jm(  J/l(ofc)), 

so  that  we  will  be  seeking  for  Y  such  that  Jh{otk)Y  is  the  projection  of  h[a *,]  onto  the  range  space 
of  Jh(ak): 

&k+l  = 

where  LNn(k)  =  (j^i^k) A(«fc))  Jh(al fc)-  The  map  defined  by  the  right-hand  side  of  the  above 
equation  is  contractive  as  long  as  Jh(®k)  has  full  rank,  in  which  case  the  scheme  is  guaranteed  to 
converge  to  some  (possibly  local)  minimum. 

At  each  time  the  scheme  will  converge  to  some  a*,  which  best  explains  the  noisy  measurements 
j/j,  y 1;  hence  we  have  a*  =  a  +  na  where  na  is  a  noise  term  whose  variance  can  be  inferred  from  the 
variance  of  n;  and  a  linearization  of  the  scheme  about  zero-noise.  The  estimate  obtained  at  each 
fixed  time,  together  with  its  variance,  is  fed  to  a  time-integration  step,  which  we  describe  next. 

4.4.2  Propagation  along  time:  disambiguation  of  local  minima 

At  each  fixed  time  the  iteration  along  k  converges  to  a  fixed  point  a*(t),  then  we  may  propagate 
the  information  across  time  with  a  similar  iteration: 

d(f  +  1)  =  a(t)  +  L(t)  [a*(t)  -  d(t)] 

which  implements  a  linear  Kalman  filter  based  upon  the  model 

a(t+l)  =  a(t)  +  n0(t)  (30) 

a*(t)  =  a{t)  +  na(t )  (31) 

where  no  is  the  error  of  the  random  walk  model  for  motion,  which  we  assume  to  be  white  zero-mean 
and  gaussian,  and  na  is  the  error  made  by  the  fixed-time  iteration.  L  is  the  usual  linear  Kalman 
gain  [33,  31].  The  above  model  has  all  the  desirable  properties,  as  it  satisfies  the  conditions  of  the 
fundamental  theorem  of  the  asymptotic  theory  of  Kalman  Filtering. 

Suppose  now  that  the  fc-iteration  has  converged  to  a  local  minimum,  which  is  compatible  with 
the  current  observations.  At  the  next  step  the  f-iteration  will  predict  an  estimate  which  is  in 
general  no  longer  compatible  with  the  current  observations.  This  should  help  to  disambiguate  local 
minima  as  the  measurements  accumulate  in  time. 

5  Application  to  the  visual  motion  estimation  problem 

We  have  seen  in  the  previous  sections  that  motion  estimation  may  be  regarded  as  estimation  of  the 
state  of  a  system  of  a  difference  equations  on  the  essential  manifold  having  unknown  inputs. 

The  first  approach  we  describe  consists  in  composing  equations  (8)  with  the  local  coordinate 
chart  $,  ending  up  with  a  nonlinear  dynamical  model  for  motion  in  H5.  At  this  point  we  have  to 
make  some  assumptions  about  motion:  since  we  do  not  have  any  dynamical  model,  we  will  assume 
a  statistical  model.  In  particular  we  will  assume  that  motion  is  a  first  order  random  walk  (brownian 
motion)  in  IR5  (see  fig.  3  left).  The  problem  then  becomes  that  of  estimating  the  state  of  a  nonlinear 
system  driven  by  white,  zero-mean  gaussian  noise.  This  will  be  done  using  the  technique  developed 
in  the  previous  section. 

In  the  second  approach  we  change  the  model  for  motion:  in  particular  we  assume  motion  to  be  a 
first  order  random  walk  in  R9  projected  onto  the  essential  manifold  (see  fig.  3  left).  We  will  see  that 
this  leads  to  a  method  for  estimating  motion  via  solving  at  each  step  a  linear  estimation  problem 
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Figure  3:  (Left)  Model  of  motion  as  a  random  walk  in  ]R5  lifted  to  the  manifold  or  as  a  random 
walk  in  !R9  projected  onto  the  manifold.  (Right)  Estimation  on  the  Essential  Space 

in  the  linear  embedding  space  and  then  “projecting”  the  estimate  onto  the  essential  manifold  (see 
fig.  3  right). 

It  is  very  important  to  understand  that  these  are  modeling  assumptions  about  motion  which 
can  be  validated  only  a  posteriori.  In  general  we  observe  that  the  first  method  solves  a  strongly 
nonlinear  problem  with  techniques  which  are  based  upon  linearization  of  the  system  about  the 
current  reference  trajectory,  so  that  the  linearization  error  may  be  relevant.  The  second  method 
does  not  involve  any  linearization,  while  it  imposes  the  constraint  of  belonging  to  the  essential 
manifold  in  a  weaker  way.  This  approach  has  indeed  a  very  transparent  structure  which  can  be 
studied  in  full  detail. 

The  third  method  is  based  upon  splitting  the  iteration  according  to  the  recursive  Gauss-Newton 
scheme  illustrated  in  the  previous  section. 

The  next  three  sections  are  devoted  to  describing  these  three  techniques.  Note  that  each  method 
produces,  together  with  the  motion  estimates,  the  variance  of  the  estimation  error,  which  is  to  be 
used  by  the  subsequent  modules  of  the  structure  and  motion  estimation  scheme  [57]. 


5.1  Local  coordinates  estimator 

Consider  composing  the  system  (8)  with  the  map  $: 


$  :  E 
Q 


S2  x  IR3  ~  IR5 


where  T  is  expressed  in  spherical  coordinates  for  radius  one,  for  convenience  of  representation. 
Then  the  system  in  local  coordinate  becomes 


£(*  +  1)  =  £(<)  +  H{t)  ;  £(i0)  =  & 
0  =  Xy(t)y(t)Q(£(0  +  »(*)• 


(32) 

(33) 


As  we  said  we  model  motion  {£}  as  a  first  order  random  walk:  n^(t)  £  Af(0,  R()  for  some  which 
is  referred  to  as  variance  of  the  model  error.  While  the  above  assumption  is  rather  arbitrary  and 
can  be  validated  only  a  posteriori,  it  is  often  safe  to  assume  that  the  noise  in  the  measurements 
y(t )  is  a  white  zero-mean  gaussian  process  with  variance  Rn. 


The  system  above  is  now  in  a  form  suitable  for  using  an  Implicit  Extended  Kalman  Filter  (EKF). 
Finally  the  equations  of  the  estimator  can  be  summarized:  call  C  =  (%p)  and  D  = 
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Prediction  step: 


(34) 

(35) 


Z(t  +  l|t)  =  ;  f(0|0)  =  £0 

P{t  +  l|i)  =  P(t|t)  +  Ri  ;  P(0|0)  =  P0 

Update  step: 


|(t  +  l|t+l)  =  i(t  +  l\t)  -  L(t  +  l)XQ(tit  +  l\t),t)  (36) 

P(t  +  l|t  +1)  =  I\t  -f-  1  )P(t  +  l|i)TT(f  +  1)  +  L(t  +  l)Rn(t  +  1  )LT(t  +  1)  (37) 

Gain: 

L(t  +  1)  =  P(t  +  l\t)CT(t  +  l)A"x(f  +  1)  (38) 

A(t+1)  =  C(t+l)P(t  +  l\t)CT{t  +  l)  + Rh(t  +  l)  (39) 

r(t  +  l)  =  /-I(i  +  l)C(t+l)  (40) 

Innovation  variance: 

Ra(t  +  1)  =  D(t  +  l)RnDT{t  +  1)  (41) 


Note  that  P(t\t)  is  the  variance  of  the  motion  estimation  error  which  is  used  as  variance  of  mea¬ 
surement  error  by  the  subsequent  modules  of  the  motion  and  structure  estimation  scheme.  This 
formulation  was  first  introduced  by  Di  Bernardo  et  al.  [13]  in  a  slightly  different  form. 

5.2  The  essential  estimator 

Suppose  that  motion,  instead  of  being  a  random  walk  in  St5,  is  represented  in  the  essential  manifold 
as  the  “projection”  of  a  random  walk  through  St9  (see  fig.  3).  The  “projection”  operator  onto  the 
space  E  is  denoted  by  pr<ij>(-): 

W<E>  '•  Bt3  x  3  —»■  E  (42) 

M  i  ►  Udiag{l,l,0}VT 

where  U,V  are  defined  by  the  Singular  Value  Decomposition  of  M  =  U£Vr.  The  fact  that 
this  operator  maps  onto  the  essential  manifold  is  proved  in  appendix  B.  Note  that  the  projection 
minimizes  the  Frobenius  norm  and  the  2-norm  of  the  distance  from  a  point  in  II3 x 3  to  the  essential 
manifold  [23,  41,  69]. 

Now  we  define  the  operator  ©  that  takes  two  elements  in  lR  ix3,  sums  them  and  then  projects 
the  result  onto  the  essential  manifold: 

©  :  H3x3  x  1R3x3  -►  E 

Ml, M2  t-»-  Q  =  pr<£>(Ml  +  M2) 

where  the  symbol  +  is  the  usual  sum  in  1R3x3.  With  the  above  definitions  our  model  for  motion 
becomes  simply 

Q(*  +  l)  =  Q(i)®»Q(i)  (43) 


16 


where  n,Q(t)  €  .A/(0,  Rnq)  is  represented  by  a  white  zero-mean  gaussian  noise  in  1R9.  If  we  couple 
the  above  equation  with  (8)  we  have  again  a  dynamical  model  on  an  euclidean  space  (in  our  case 
H9)  driven  by  white  noise.  The  Essential  Estimator  is  the  least  variance  filter  built  for  the  above 
model,  and  corresponds  to  a  linear  Kalman  filter  update  in  the  embedding  space,  followed  by  a 
projection  onto  the  essential  manifold.  Note  that  in  principle  the  gain  could  be  precomputed  offline, 
for  each  possible  configuration  of  motion  and  feature  positions. 

Prediction  step: 


Q(t  +  l\t)  =  Q(<|<)  ;  Q(G|0)  —  Q0  (44) 

P(t  +  l\t)  =  P(t\t)  +  Rq  ;  P(0|0)  =  P0  (45) 

Update  step: 

Q(t  +  l|f+l)  =  Q(t  +  l\t)®L(t  +  l)x(t)Q(t  +  l\t)  (46) 

P(i  +  l|t  +  l)  =  T(t  +  l)P(t  +  l\t)VT(t  +  1)  +  L{t  +  l)Rn(t  +  l)LT(t  +  1)  (47) 

Gain: 

L(t+ 1)  =  -P(<  +  l|0x(<)A_1(i  +  l)  (48) 

A(«+l)  =  x{t)P(t  +  l\t)X(t)  +  Rh(t  +  l)  (49) 

T(t+1)  =  I-L(t+l)X(t)  (50) 

Rn(t+ 1)  =  D(t  +  l)RnDT(t+l)  (51) 


5.3  2-D  fixed-point  estimator 

At  each  time  instant  a  new  set  of  measurements  becomes  available  in  the  form  of  position  of 
projected  points  onto  the  image  plane,  encoded  in  y(<).  The  essential  constraint  imposes 

xmm)  =  o  vt. 

The  Gauss-Newton  method  generates  the  iteration 

X(6)  =  Jx((kMk+ 1  -  6]- 

At  each  iteration  we  solve  for  Y  the  linear  problem 

Jx(ik)Y  =  X(tk) 

and  then  define  £*,+ 1  =  £k  +  Y. 

5.4  Identifiability  of  rigid  motion 

The  theoretical  observability  /identifiability  of  the  models  thus  refined  is  addressed  in  [52].  It  is 
proved  that  the  model  is  globally  observable  once  the  viewer  does  not  move  on  a  quadric  surface 
which  contains  all  the  visible  points.  Note  that  such  a  condition  is  satisfied  almost  always  due  to 
noise  in  the  measurements. 
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5.5  Further  issues 

Insofar  we  have  assumed  that  ||T||  ^  0.  It  may  be  shown  that  there  is  no  loss  of  generality 
in  this  assumption  [55].  In  fact,  due  to  the  noise  in  the  measurements,  there  will  be  always  a 
translation  compatible  (in  least  squares  sense)  with  the  observations.  The  scheme  automatically 
scales  translation  to  unit  norm  and  the  inverse  depth.  The  issue  is  discussed  in  [55]. 

The  scheme  may  be  further  extended  to  more  general  camera  models  in  order  to  estimate  camera 
internal  parameters  along  with  visual  motion  (camera  self- calibration).  See  [54]  for  details. 

The  essential  filters  are  used  in  deriving  a  scene  segmentation  method  based  on  3D  motion, 
which  proved  successful  in  extreme  experiments  such  as  the  segmentation  of  the  Ullmann  scene  of 
two  transparent  cylinders  counter-rotating  one  inside  the  other  [56]. 

6  Experimental  assessment 

We  have  tested  the  described  algorithms  on  a  variety  of  motion  and  structure  configurations.  We 
report  the  simulations  performed  on  the  same  data  sets  of  [57].  These  consist  of  views  of  a  cloud  of 
points  under  a  discontinuous  motion  with  singular  regions  (zero-translation  and  non-zero  rotation) 
and  are  described  in  [57].  Gaussian  noise  with  1  pixel  Std  has  been  added  to  the  measurements. 
Simulations  have  been  performed  with  a  variable  number  of  points  down  to  1  point  for  constant 
velocity  motion,  and  show  consistent  performance. 

The  local  coordinates  estimator 

In  fig.  4,5  we  show  the  three  components  of  translational  and  rotational  velocity  as  estimated  by 
the  local  coordinate  estimator.  Convergence  is  reached  in  less  than  20  steps.  Tuning  has  been 
performed,  as  with  the  other  schemes,  within  an  order  of  magnitude,  and  the  Std  of  the  estimation 
error  are  reported  in  the  tables  below.  It  must  be  pointed  out  that  we  have  observed  a  better 
behavior  by  increasing  the  variance  of  the  pseudoinnovation.  This  is  due  to  the  fact  that  the 
EKF  relies  on  the  hypothesis  that  the  measurement  noise  is  white  and  the  linearization  error  is 
negligible,  while  in  this  case  it  is  not.  Initialization  is  performed  with  one  step  of  the  traditional 
Longuet-Higgins  algorithm.  The  computational  cost  of  one  iteration  os  of  about  100  Kflops  for  20 
points. 

Note  that  if  we  have  available  some  dynamical  model  for  motion  we  may  easily  insert  it  into 
the  state  model. 

The  Essential  estimator 

In  fig.  8  we  show  the  9  components  of  the  essential  matrix  as  estimated  by  the  essential  estimator, 
convergence  is  4  times  slower  than  the  local  coordinate  version,  but  each  step  is  10  times  faster.  Note 
that  in  principle  the  gains  may  be  precomputed  offline,  for  each  possible  configuration  of  points  in 
the  image  plane.  We  have  noted  step-like  convergence  with  plateaus  followed  by  switching  regions. 
These  correspond  to  switching  of  the  first  two  eigenspaces  of  the  SVD  of  Q.  When  brought  to  local 
coordinates  we  have  estimates  for  rotation  and  translation  6,7.  It  is  noted  that  the  homeomorphism 
$  may  have  singularities  due  to  noise  when  the  last  eigenspace  is  changed  with  one  of  the  other 
two.  This  causes  the  spikes  observed  in  the  estimates  of  motion.  However,  note  that  there  is  no 
transient  to  recover,  since  the  errors  do  not  occur  in  the  estimation  step,  but  in  transferring  to 
local  coordinates.  The  switching  can  be  avoided  by  a  higher  level  control  on  the  continuity  of  the 
singular  values.  The  computational  cost  amounts  to  circa  41  Kflops  per  each  step  for  20  points. 
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Translational  Velocity:  local  coordinates  (solid)  vs.  truth  (dotted) 
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Figure  4:  Components  of  translational  velocity  as  estimated  by  the  local  coordinates  estimator. 
The  ground  truth  is  shown  in  dotted  lines. 


The  2-D  iteration 

The  performance  of  the  2-D  iteration  is  reported  in  fig.  9,10.  This  scheme  proved  very  accurate 
after  proper  initialization,  even  though  the  error  analysis  used  for  calculating  the  variance  of  the 
estimates  at  each  fixed  time  was  approximate.  Speed  may  be  adjusted  by  varying  the  number  of 
iterations  at  each  fixed  time.  We  have  noticed  that  this  converges  after  a  number  of  steps  between 
3  and  7.  The  cost  of  the  scheme  for  7  iterations  and  20  points  is  100  Kflops.  The  simulations 
reported  were  done  using  a  constant  variance  of  the  error  of  the  k-iteration,  and  hence  show  larger 
errors  than  the  other  schemes. 

We  now  summarize  the  performance  of  the  three  schemes:  mean  and  Std  are  computed  between 
time  30  and  50  for  the  local  coordinate  scheme  and  the  2-D  iteration,  while  between  time  150  and 
200  for  the  essential  estimator. 


Scheme 

Tx 

Ty 

TZ 

Local 

M:  .0002  Std:.0004 

M:-.0015  Std:  .0048 

M:  .0002  Std:  .0004 

Essential 

M:  3.9754E-5  Std:  .0001 

M:  .0017  Std:  .0013 

M:  .0002  Std:  .0001 

2-D 

M:  .376E-3  Std:  .0009 

M:  -.0835E-3  Std:  .0071 

M:  .2851E-3  Std:  .0009 

Scheme 

fix 

ily 

fiz 

Local 

M:.0008  Std:  .0022 

M:.0002  Std:. 0002 

M:-.0002  Std:. 0008 

Essential 

M:-.0008  Std:  .0004 

M:  3.9949E-6  Std:  .0002 

M:  -1.6107E-5  Std:  .0004 

2-D 

M:  .2156E-3  Std:  .0034 

M:  .2261E-3  Std:  .0006 

M:.0073E-3  Std:.0006 

Experiments  on  real  image  sequences 
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Rotational  Velocity:  local  coordinates  (solid)  vs.  truth  (dotted) 
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Figure  5:  Components  of  rotational  velocity  as  estimated  by  the  local  coordinates  estimator. 

We  have  tested  our  schemes  on  a  sequence  of  10  images  of  the  rocket  scene  (see  fig  11).  There 
are  22  feature  points  visible,  and  the  standard  deviation  of  the  location  error  on  the  image  plane 
is  about  one  pixel.  The  local  coordinates  estimator  has  a  transient  of  about  20  steps  to  converge 
from  arbitrary  initial  condition.  Hence  we  have  run  the  local  estimator  on  the  10  images  starting 
from  zero  initial  condition,  and  we  have  used  the  final  estimate  as  initial  condition  for  a  new  run, 
whose  results  we  report  in  figures  12-14.  We  did  not  perform  any  ad  hoc  tuning,  and  the  setting 
was  the  same  used  in  the  simulations  described  at  the  previous  paragraphs.  In  fig.  12  we  report 
the  6  motion  components  as  estimated  from  the  local  coordinates  estimator  and  the  corresponding 
ground  truth  (in  dotted  lines);  the  estimation  error  is  plotted  in  figure  13.  As  it  can  be  seen  the 
estimates  are  within  5%  error,  and  the  final  estimate  is  less  than  1%  off  the  true  motion.  Finally 
in  fig. 14  we  report  the  norm  of  the  pseudo-innovation  of  the  filter,  which  converges  to  a  value  of 
about  10-3  in  less  than  10  +  5  steps. 

In  this  experiment  we  have  used  the  true  norm  of  translation  as  the  scale  factor.  We  have  also 
run  simulations  in  which  the  scale  factor  was  calculated  by  updating  the  estimate  of  the  distance 
between  the  two  closest  features,  as  in  the  experiments  described  in  the  previous  paragraphs.  In 
this  case,  however,  convergence  is  slower,  as  the  innovation  norm  reaches  regime  in  about  20-25 
steps. 

7  Conclusions 

We  have  proposed  a  novel  formulation  of  the  visual  motion  problem  in  a  system  theoretic  framework 
as  the  identification  of  an  exterior  differential  system  with  parameters  on  a  topological  manifold. 
We  have  first  addressed  the  general  identification  problem  using  nonlinear  prediction  error  criteria, 
and  then  applied  the  results  to  the  visual  motion  problem.  We  have  shown  that  other  tasks  in 
computer  vision  may  be  formulated  as  the  identification  of  exterior  an  differential  system  with 
parameters  on  a  manifold  and  solved  in  a  principled  manner. 
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x  10'2  Translational  Velocity:  Essential  estimator  (solid)  vs.  truth  (dotted  ) 
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Figure  6:  Components  of  translational  velocity  as  estimated  by  the  essential  estimator.  Note  the 
spikes  due  to  the  local  coordinates  transformation.  Note  also  that  they  do  not  affect  convergence 
since  they  do  not  occur  in  the  estimation  process,  but  while  transferring  to  local  coordinates. 

The  proposed  schemes  prove  very  accurate  and  robust,  as  well  as  computationally  light,  as 
we  show  in  the  experimental  section.  Easy  extensions  of  the  schemes  allow  solving  the  camera 
self-calibration  problem  and  the  3D  motion-based  segmentation. 
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Rotational  Velocity:  Essential  estimator  (solid)  vs.  truth  (dotted) 
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Figure  7:  Components  of  rotational  velocity  as  estimated  by  the  local  coordinates  estimator.  The 
ground  truth  is  shown  in  dotted  lines.  Note  the  spikes  due  to  the  local  coordinates  transformation. 
Note  also  that  there  is  no  transient  to  recover  since  they  do  not  occur  in  the  estimation  process. 
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A  Extended  Kalman  Filtering  for  Implicit  Measurement  Con¬ 
straints 

We  are  interested  in  building  an  estimator  for  a  process  {a}  which  is  described  by  a  stochastic 
difference  equation 

a(t  +  1)  =  f(a(t))  +  v(t )  ;  a(f0)  =  «o 

where  v(t)  €  Af(0,Qv).  Suppose  there  is  a  measurable  quantity  x(t)  which  is  linked  to  a  by  the 
constraint 

h[a(t),  x(t)]  =  0  Vt.  (52) 

We  will  assume  throughout  /,  h  €  Cr  ;  r  >  1.  Usually  x  is  known  via  some  noisy  measurement: 

x(t)  =  y(t)  +  w(t)  :  w(t)  e  Af(0,  Rw)  (53) 

where  the  variance/ covariance  matrix  Rw  is  derived  from  knowledge  of  the  measurement  device. 
The  model  we  consider  is  hence  of  the  form 

/  a(t  +  1)  =  f{a(t))  +  v(t)  ;  a(t0 )  =  o0 

\h[a(t),y(t)  +  w(t)]  =  0  '  l  j 


Construction  of  the  variational  model  about  the  reference  trajectory 

Consider  at  each  time  sample  t  a  reference  trajectory  a(t)  which  solves  the  difference  equation 

a(t  +  l)  =  f(a(t)) 


and  the  jacobian  matrix 

W)),m)  =  (g)iw 

The  linearization  of  the  measurement  equation  about  the  point  ( a(t),y(t ))  is 

h[a(t),  x(i)]  =  h[a(t),  y(t )]  +  C{a,  y)(a(t)  -  a(t))  +  D(a ,  y)(x(t)  -  y(t))  +  0(£2) 


where 


=  (£ 
dh 

dx 

£2  =  {||a-a||M|x-y||2}. 

Exploiting  the  fact  that  h[a,  a:]  =  0,  calling  6a(t)  =  a(t)  -  a(t )  and  neglecting  the  arguments  in  C 
and  D,  we  have,  up  to  second  order  terms 

h[a(t),y(t)]  —  -C6a(t)  —  Dw(t). 


D(a,y ) 


*( 


^  la(t),y(t) 

^ !«(»),»( o 
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Prediction  Step 

Suppose  at  some  time  t  we  have  available  the  best  estimate  d(t|/);  we  may  write  the  variational 
model  about  the  trajectory  a(t)  defined  such  that 

a(t  +  1)  =  f(a(t))  ;  a(t)  =  a(t\t). 

For  small  displacements  we  may  write 

Sa(t  +  1)  =  F(a(t))Sa(t )  4-  v(t)  (55) 

where  the  noise  term  v(t)  may  include  a  linearization  error  component. 

Note  that  with  such  a  choice  we  have  <5d(f|t)  =  0  and  f>a(t  +  l\t)  =  F(a(t))6a(t\t)  =  0,  from 
which  we  can  conclude 

a(t  +  1|/)  =  a(t  +  1)  =  =  f(a(t\t)).  (56) 

The  variance  of  the  prediction  error  8a(t  +  1|/)  is 

P(t  +  1|#)  =  F(t)P(t\t)FT  (t)  +  Q  (57) 

where  Q  =  var(v).  The  last  two  equations  represent  the  prediction  step  for  the  estimator  and  are 
equal,  as  expected,  to  the  prediction  of  the  explicit  EKF  [33,  31,  8]. 

Update  Step 

At  time  i  +  1  a  new  measurement  becomes  available  y(t  + 1),  together  with  the  prediction  a(t  +  1  \t) 
and  its  error  variance  P(t  +  1|/).  Exploiting  the  linearization  of  the  measurement  equation  about 
a(t  +  1)  =  a(t  +  1|/),  we  obtain,  letting  a  i  a(t  +  1|£)  and  y  =  y(t  +  1), 

h[a,  y]  =  -C(d,  y)8a(t  +  1)  -  n(t  +  1)  (58) 

where  we  have  defined  n  =  D(a,y)w(t+  1).  This,  together  with  the  equation  (55)  defines  a  linear 
and  explicit  variational  model,  for  which  we  can  finally  write  the  update  equation  based  on  the 
traditional  linear  Kalman  filter: 

8a(t  +  l|f  +  1)  =  8a(t  +  l|t)  +  L(t  +  1)  [h[d,  y]  +  C8a(t  +  l|t)]  (59) 

where 

L(t  +  1)  =  -P(t  +  l\t)CTA~l(t  +  1) 

A(/+l)  =  CP(t\t)CT  +  Rn(t  +  1) 

P(t  +  l\t+l)  =  T(t  +  l)P(t  +  l\t)TT(t  +  l)  +  LRn(t  +  l)LT 
T  =  ( I-LC ). 

Since  8a(t  +  1  |i)  =  0  and  8a(t  +  1| t  +  1)  =  a(t  +  1| t  +  1)  ~  a(t  +  1  [i ),  we  may  write  the  update 
equation  for  the  original  model: 

ct{t  T  l|t  T  1)  =  ot{t  T  l|i)  T  L(t  T  l)h  [d(t  T  1|^)?  y(t  T  1)]  •  (60) 

In  this  formulation  the  quantity  h[a(t  +  lj<), y(i  +  1)]  plays  the  role  of  the  pseudo-innovation.  The 
noise  n  defined  in  (58)  has  a  variance  which  is  calculated  from  its  definition: 

Rn(t)  =  D(a,y)Rw(t)DT(a,y). 

The  implicit  Kalman  filter  was  used  by  other  researchers  such  as  Darmon  [12],  Faugeras  [38,  69] 
and  Heel  [27]. 
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B  Projection  onto  the  essential  space 

We  have  defined  the  projection  operator  onto  the  essential  manifold  without  proving  that  the  result 
is  in  fact  an  element  of  the  essential  manifold.  In  fact  the  following  theorem,  which  was  first  stated 
by  Faugeras  in  1990  [23,  41],  shows  that  a  characterizing  property  of  the  essential  manifold  is  that 
its  elements  have  two  non-zero  equal  singular  values  and  a  zero  singular  value. 

Theorem  B.l  . 

Let  Q  =  UT,Vt  be  the  SVD  of  an  element  of  GL( 3).  Then 

<5  G  £«$*£  =  £  0  =  diag{  A  A  0}  |  A  €  H+. 


Proof: 

(=>)  let  Q  =  RS\R  G  £0(3),  S  G  so(3);  cr(Q),  the  set  of  singular  values  of  Q,  is  such  that  &(Q)  — 
\Jv(QQT)-  Next  observe  that  QQT  =  RSSTRT  =  SST  =  - S 2 .  Also  V£  G  so(3)3!T  |  S  = 
(T A),  and  the  singular  values  of  S2  are  {0,  ||Tj|2,  ||T||2}.  Hence  if  Q  G  E,  it  has  two  equal 
singular  values  and  a  zero  singular  value. 

(<=)  let  Q  =  UTi qVt  for  some  orthonormal  U,  V  and  for  some  A.  Let  furthermore  Rz( f)  be  a 
rotation  of  ^  about  the  Z  axis,  then 

Q  =  H£  0VT  =  URz(~)tVtVRz(^)X> 0VT . 

Now  call  R  =  U Rz(^)TVt  and  S  =  VRz( f)£oPr;  it  is  immediate  to  see  that  RRt  — 
RT R  =  Is  and  ST  =  —5,  hence  the  claim.  Q.E.D. 
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x  10‘2  Components  of  Q:  Essential  estimator  (solid)  vs.  truth  (dotted) 
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Figure  8:  Components  of  the  essential  matrix  as  estimated  by  the  essential  estimator.  Note  that 
there  are  no  spikes  and  the  estimate  is  smooth.  Note  that  the  estimates  between  time  200  and  300 
are  not  significant,  as  the  ground  truth  (dotted  line)  is  scaled  to  zero. 
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Translational  Velocity:  2D  estimator  (solid)  vs.  truth  (dotted) 
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Figure  9:  Components  of  translational  velocity  as  estimated  by  the  double  iteration  estimator. 


Rotational  Velocity:  2D  estimator  (solid)  vs.  truth  (dotted) 
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Figure  10:  Components  of  rotational  velocity  as  estimated  by  the  double  iteration  estimator. 


30 


Figure  11:  One  image  of  the  rocket  scene. 


Motion  Estimates  for  the  rocket  sequence 


Figure  12:  Motion  estimates  for  the  rocket  sequence:  The  six  components  of  motion  as  estimated 
by  the  local  coordinates  estimator  are  showed  in  solid  lines.  The  corresponding  ground  truth  is  in 
dotted  lines. 
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Motion  error  components  for  the  rocket  sequence 


Figure  13:  Error  in  the  motion  estimates  for  the  rocket  sequence.  All  components  are  within  5% 
of  the  true  motion. 


x  lO*2  Norm  of  the  innovation  for  the  rocket  sequence 


Figure  14:  Norm  of  the  pseudo-innovation  process  of  the  local  estimator  for  the  rocket  scene. 
Convergence  is  reached  in  less  than  5  steps. 
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